SageMaker Debugger


Improving Mask RCNN Convergence with PyTorch Lightning and SageMaker Debugger

#artificialintelligence

MLPerf training times represent the state of the art in machine learning performance: AI industry leaders publish their best training times for a set of common machine learning models. But optimizing for training speed means these models are often complex and difficult to move to practical applications. Last year, we published SageMakerCV, a collection of computer vision models based on MLPerf, but with added flexibility and optimization for use on Amazon SageMaker. The recently published MLPerf 2.0 adds a series of new optimizations. In this blog, we discuss those optimizations, and how we can use PyTorch Lightning and SageMaker Debugger to further improve training performance and flexibility.


The science behind SageMaker's cost-saving Debugger

#artificialintelligence

A machine learning training job can seem to be running like a charm, while it's really suffering from problems such as overfitting, exploding model parameters, and vanishing gradients, which can compromise model performance. Historically, spotting such problems during training has required the persistent attention of a machine learning expert. The Amazon SageMaker team has developed a new tool, SageMaker Debugger, that automates this problem-spotting process, saving customers time and money. For example, by using Debugger, one SageMaker customer reduced model size by 45% and the number of GPU operations by 33%, while improving accuracy. Next week, at the Conference on Machine Learning and Systems (MLSys), we will present a paper that describes the technology behind SageMaker Debugger.
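The problems the excerpt mentions — exploding parameters and vanishing gradients — are exactly the kind of thing a rule can flag automatically from saved training data. The following is a minimal illustrative sketch of such a rule in plain Python, not SageMaker Debugger's actual API; the thresholds are assumptions for the example, not Debugger's defaults.

```python
# Sketch of a Debugger-style rule: flag vanishing or exploding gradients
# from per-step gradient norms. Thresholds here are illustrative only.

def check_gradients(grad_norms, vanish_threshold=1e-7, explode_threshold=1e3):
    """Return (step, issue) pairs for suspicious gradient norms."""
    issues = []
    for step, norm in enumerate(grad_norms):
        if norm < vanish_threshold:
            issues.append((step, "vanishing_gradient"))
        elif norm > explode_threshold:
            issues.append((step, "exploding_gradient"))
    return issues

# Example: a run whose gradients collapse toward zero in later steps
norms = [0.9, 0.5, 0.1, 1e-4, 1e-8, 1e-9]
print(check_gradients(norms))  # flags steps 4 and 5 as vanishing
```

In the real service, rules like this run as separate processing jobs against tensors the training job emits, so spotting a problem no longer requires a human watching the metrics.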


Utilizing XGBoost training reports to improve your models

#artificialintelligence

In 2019, AWS unveiled Amazon SageMaker Debugger, a SageMaker capability that enables you to automatically detect a variety of issues that may arise while a model is being trained. SageMaker Debugger captures model state data at specified intervals during a training job. With this data, SageMaker Debugger can detect training issues or anomalies by leveraging built-in or user-defined rules. In addition to detecting issues during the training job, you can analyze the captured state data afterwards to evaluate model performance and identify areas for improvement. This task is made easier with the newly launched XGBoost training report feature.
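The two mechanics described above — capturing state at specified intervals, then evaluating rules over the captured values — can be sketched in a few lines. This is a hypothetical toy in plain Python, not the smdebug library's API; the save interval, loss histories, and the overfitting rule's `patience` parameter are all assumptions for illustration.

```python
# Toy version of interval-based capture plus a user-defined rule,
# in the spirit of SageMaker Debugger's rule engine.

def capture(history, save_interval=2):
    """Keep only values saved every `save_interval` steps."""
    return {step: v for step, v in enumerate(history) if step % save_interval == 0}

def overfit_rule(train_loss, val_loss, patience=2):
    """Fire at the captured step where validation loss has risen for
    `patience` consecutive captures while training loss kept falling."""
    steps = sorted(train_loss)
    rising = 0
    for prev, cur in zip(steps, steps[1:]):
        if val_loss[cur] > val_loss[prev] and train_loss[cur] < train_loss[prev]:
            rising += 1
            if rising >= patience:
                return cur
        else:
            rising = 0
    return None

train_hist = [1.0, 0.8, 0.6, 0.5, 0.4, 0.35, 0.3, 0.28]
val_hist = [1.1, 0.9, 0.7, 0.68, 0.72, 0.8, 0.9, 1.0]
train, val = capture(train_hist), capture(val_hist)
print(overfit_rule(train, val))  # → 6: the captured step where the rule fires
```

The same captured data can also be analyzed after the job finishes, which is what the XGBoost training report builds on.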


Analyzing open-source ML pipeline models in real time using Amazon SageMaker Debugger

#artificialintelligence

Open-source workflow managers are popular because they make it easy to orchestrate machine learning (ML) jobs for production. Taking models into production following a GitOps pattern is best managed by a container-friendly workflow manager, a practice known as MLOps. Kubeflow Pipelines (KFP) is one of the Kubernetes-based workflow managers used today. However, it doesn't provide all the functionality you need for a best-in-class data science and ML engineering experience. A common issue when developing ML models is having access to tensor-level metadata describing how the job is performing.
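"Tensor-level metadata" typically means summary statistics emitted per tensor per save step, rather than the full tensors. As a hedged illustration of the idea (not the smdebug hook's actual output format — the field names here are assumptions), a hook might emit records like this:

```python
# Hypothetical shape of per-tensor summary statistics a debugging hook
# could emit at each save step, for real-time pipeline monitoring.

def tensor_summary(name, step, values):
    """Reduce a tensor (here a flat list of floats) to summary stats."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return {
        "name": name,
        "step": step,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": var ** 0.5,
    }

record = tensor_summary("gradients/conv1", step=0, values=[1.0, 2.0, 3.0, 4.0])
print(record["mean"], record["min"], record["max"])  # 2.5 1.0 4.0
```

Streaming records like these out of a KFP step is what lets an external analyzer watch a pipeline job in real time.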


Recap of AWS re:Invent 2020

#artificialintelligence

This year the annual re:Invent conference organized by AWS was virtual, free, and three weeks long. During multiple keynotes and sessions, AWS announced new features, improvements, and cloud services. Below is a review of the main announcements affecting compute, database, storage, networking, machine learning, and development. On the very first day of the conference, Amazon announced EC2 Mac instances for macOS, adding a new operating system to EC2 after many years. This is mainly targeted at workloads that only run on macOS, such as building and testing applications for iOS, macOS, tvOS, and Safari.


New – Profile Your Machine Learning Training Jobs With Amazon SageMaker Debugger

#artificialintelligence

Today, I'm extremely happy to announce that Amazon SageMaker Debugger can now profile machine learning models, making it much easier to identify and fix training issues caused by hardware resource usage. Despite its impressive performance on a wide range of business problems, machine learning (ML) remains a bit of a mysterious topic. Getting things right is an alchemy of science, craftsmanship (some would say wizardry), and sometimes luck. In particular, model training is a complex process whose outcome depends on the quality of your dataset, your algorithm, its parameters, and the infrastructure you're training on. As ML models become ever larger and more complex (I'm looking at you, deep learning), one growing issue is the amount of infrastructure required to train them.
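The profiling feature described here works by sampling system metrics (CPU, GPU, I/O) during training and flagging patterns such as sustained GPU underutilization, which often points to a data-loading bottleneck. Below is a minimal plain-Python sketch of that kind of analysis, under assumed inputs and an illustrative threshold; it is not the Debugger profiler's actual implementation.

```python
# Sketch: find windows where sampled GPU utilization (%) stays low,
# the signature of a training job starved by its input pipeline.

def low_utilization_windows(samples, threshold=30, min_len=3):
    """Return (start, end) index ranges where utilization stays below
    `threshold` for at least `min_len` consecutive samples."""
    windows, start = [], None
    for i, u in enumerate(samples):
        if u < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                windows.append((start, i - 1))
            start = None
    if start is not None and len(samples) - start >= min_len:
        windows.append((start, len(samples) - 1))
    return windows

gpu = [85, 90, 10, 5, 8, 88, 92, 15, 12, 9, 11]
print(low_utilization_windows(gpu))  # [(2, 4), (7, 10)]
```

In the managed service, reports like this are generated automatically alongside the training job, so the stalls are visible without instrumenting the training script by hand.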